Smart Paper Notes

Instance-specific Label Distribution Regularization for Learning with Label Noise

Zehui Liao , Shishuai Hu , Yutong Xie , Yong Xia

Category: Computer Vision

2022-12-16

Modeling the noise transition matrix is a promising approach to learning with label noise. Based on the estimated noise transition matrix and the noisy posterior probabilities, the clean posterior probabilities, which are jointly called the Label Distribution (LD) in this paper, can be calculated and used as the supervision. To reliably estimate the noise transition matrix, some methods assume that anchor points are available during training. Nonetheless, if the anchor points are invalid, the noise transition matrix might be poorly learned, resulting in poor performance. Consequently, other methods treat reliable data points, extracted from the training data, as pseudo anchor points. However, from a statistical point of view, the noise transition matrix can be inferred from data with noisy labels under the clean-label-domination assumption. Therefore, we aim to estimate the noise transition matrix without (pseudo) anchor points. There is evidence that samples are more likely to be mislabeled as similar classes, which means the mislabeling probability is highly correlated with the inter-class correlation. Inspired by this observation, we propose an instance-specific Label Distribution Regularization (LDR), in which the instance-specific LD is estimated as the supervision, to prevent DCNNs from memorizing noisy labels. Specifically, we estimate the noisy posterior under the supervision of noisy labels, and approximate the batch-level noise transition matrix by estimating the inter-class correlation matrix with neither anchor points nor pseudo anchor points. Experimental results on two synthetic noisy datasets and two real-world noisy datasets demonstrate that our LDR outperforms existing methods.
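The relation between noisy and clean posteriors that the abstract relies on can be illustrated with a small sketch. This is not the LDR method: here the transition matrix `T` is simply given (a made-up 3-class example, with `T[i, j]` assumed to mean the probability that clean class `i` is labeled as class `j`), whereas the paper estimates a batch-level matrix from inter-class correlation. The function names are hypothetical.

```python
import numpy as np

def noisy_posterior(clean_posterior, transition):
    """p(noisy = j | x) = sum_i p(clean = i | x) * T[i, j]."""
    return clean_posterior @ transition

def clean_posterior_from_noisy(noisy, transition):
    """Solve T^T q = p for the clean posterior q, then clip and renormalize."""
    clean = np.linalg.solve(transition.T, noisy.T).T
    clean = np.clip(clean, 0.0, None)
    return clean / clean.sum(axis=1, keepdims=True)

# Assumed transition matrix: neighboring (similar) classes are confused more often.
T = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])

clean = np.array([[0.7, 0.2, 0.1]])   # one sample, three classes
noisy = noisy_posterior(clean, T)
recovered = clean_posterior_from_noisy(noisy, T)
```

With an exact, invertible `T` the clean posterior is recovered exactly; in practice `T` is only an estimate, which is why the quality of that estimate matters so much.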

translated by Google Translate

Rethinking Dimensionality Reduction in Grid-based 3D Object Detection

Dihe Huang , Ying Chen , Yikang Ding , Jinli Liao , Jianlin Liu , Kai Wu , Qiang Nie , Yong Liu , Chengjie Wang

Category: Computer Vision | Robotics

2022-09-20

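Bird's-eye view (BEV) is widely adopted by most current point cloud detectors because well-established 2D detection techniques can be applied to it. However, existing methods obtain BEV features by simply collapsing voxel or point features along the height dimension, which causes a heavy loss of 3D spatial information. To alleviate this information loss, we propose a novel point cloud detection network based on a multi-level feature dimensionality reduction strategy, called MDRNet. In MDRNet, Spatial-aware Dimensionality Reduction (SDR) is designed to dynamically focus on the valuable parts of an object during the voxel-to-BEV feature transformation. In addition, Multi-level Spatial Residuals (MSR) are proposed to fuse multi-level spatial information in the BEV feature maps. Extensive experiments on nuScenes show that the proposed method outperforms state-of-the-art methods. The code will be made available upon publication.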
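For readers unfamiliar with the step the abstract criticizes, the following sketch shows two common ways of collapsing a voxel feature volume along the height axis to obtain a BEV map. It illustrates the simple baseline only, not SDR or MSR. The `(channels, height, y, x)` tensor layout and the function names are assumptions made for this example.

```python
import numpy as np

def collapse_height_by_stacking(voxels):
    """Fold the height axis into the channel axis: (C, D, H, W) -> (C * D, H, W)."""
    c, d, h, w = voxels.shape
    return voxels.reshape(c * d, h, w)

def collapse_height_by_max(voxels):
    """Max-pool over the height axis: (C, D, H, W) -> (C, H, W)."""
    return voxels.max(axis=1)

# Toy volume: 2 channels, 3 height bins, 4 x 5 ground-plane grid.
voxels = np.arange(2 * 3 * 4 * 5, dtype=np.float32).reshape(2, 3, 4, 5)
bev_stacked = collapse_height_by_stacking(voxels)
bev_max = collapse_height_by_max(voxels)
```

Max-pooling discards which height bin a response came from, while stacking keeps every value but treats height bins as unordered channels; both lose some of the 3D structure that the abstract refers to.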

Boundary-Aware Network for Abdominal Multi-Organ Segmentation

Shishuai Hu , Zehui Liao , Yong Xia

Category: Computer Vision

2022-08-29

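Automated abdominal multi-organ segmentation is a crucial yet challenging task in the computer-aided diagnosis of abdominal organ-related diseases. Although many deep learning models have achieved remarkable success in medical image segmentation tasks, accurate segmentation of abdominal organs remains challenging because of the varying sizes of the organs and the ambiguous boundaries between them. In this paper, we propose a boundary-aware network (BA-Net) to segment abdominal organs on CT scans and MRI scans. The model contains a shared encoder, a boundary decoder, and a segmentation decoder. A multi-scale deep supervision strategy is adopted on both decoders, which alleviates the issues caused by variable organ sizes. The boundary probability maps produced by the boundary decoder at each scale are used as attention to enhance the segmentation feature maps. We evaluated BA-Net on the Abdominal Multi-Organ Segmentation (AMOS) Challenge dataset and achieved an average Dice score of 89.29% for multi-organ segmentation on CT scans and an average Dice score of 71.92% on MRI scans. The results show that BA-Net outperforms nnUNet on both segmentation tasks.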
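The Dice score reported above measures the overlap between a predicted mask and a reference mask. A minimal sketch of a per-class Dice score averaged over organs is given below. The challenge's exact evaluation protocol (for example, per-case averaging or how absent organs are handled) is not stated in the abstract, so the averaging here is an assumption, and the function names are hypothetical.

```python
import numpy as np

def dice_score(pred, target, eps=1e-8):
    """Dice = 2 * |P and T| / (|P| + |T|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def mean_dice(pred_labels, target_labels, class_ids):
    """Average the per-class Dice over the given foreground class ids."""
    scores = [dice_score(pred_labels == c, target_labels == c) for c in class_ids]
    return float(np.mean(scores))

# Toy label maps with background 0 and two organ classes, 1 and 2.
pred_labels = np.array([[1, 1, 0],
                        [2, 2, 0]])
target_labels = np.array([[1, 0, 0],
                          [2, 2, 2]])
score = mean_dice(pred_labels, target_labels, class_ids=[1, 2])
```

In this toy case class 1 scores 2/3 and class 2 scores 4/5, so the mean is about 0.733.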

Boundary-Aware Network for Kidney Parsing

Shishuai Hu , Yiwen Ye , Zehui Liao , Yong Xia

Category: Computer Vision

2022-08-29

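Kidney structure segmentation is a crucial yet challenging task in the computer-aided diagnosis of surgery-based renal cancer. Although many deep learning models have achieved remarkable success in medical image segmentation tasks, accurate segmentation of kidney structures on computed tomography angiography (CTA) images remains challenging because of the variable sizes of kidney tumors and the ambiguous boundaries between kidney tumors and their surroundings. In this paper, we propose a boundary-aware network (BA-Net) to segment kidneys, kidney tumors, arteries, and veins on CTA scans. The model contains a shared encoder, a boundary decoder, and a segmentation decoder. A multi-scale deep supervision strategy is adopted on both decoders, which alleviates the issues caused by variable tumor sizes. The boundary probability maps produced by the boundary decoder at each scale are used as attention to enhance the segmentation feature maps. We evaluated BA-Net on the Kidney PArsing (KiPA) Challenge dataset and, using 4-fold cross-validation, achieved an average Dice score of 89.65% for kidney structure segmentation on CTA scans. The results demonstrate the effectiveness of BA-Net.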

Label Propagation for 3D Carotid Vessel Wall Segmentation and Atherosclerosis Diagnosis

Shishuai Hu , Zehui Liao , Yong Xia

Category: Computer Vision

2022-08-29

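Carotid vessel wall segmentation is a crucial yet challenging task in the computer-aided diagnosis of atherosclerosis. Although many deep learning models have achieved remarkable success in medical image segmentation tasks, accurate segmentation of the carotid vessel wall on magnetic resonance (MR) images remains challenging because of limited annotations and heterogeneous arteries. In this paper, we propose a semi-supervised label propagation framework to segment the lumen, normal vessel walls, and atherosclerotic vessel walls on 3D MR images. By interpolating the provided annotations, we obtain 3D continuous labels for training a 3D segmentation model. With the trained model, we generate pseudo labels for the unlabeled slices so that they can be incorporated into model training. We then use the whole MR scans and the propagated labels to retrain the segmentation model and improve its robustness. We evaluated the label propagation framework on the Carotid Vessel Wall Segmentation and Atherosclerosis Diagnosis (COSMOS) Challenge dataset and achieved a QuanM score of 83.41% on the testing dataset, which ranked first on the online evaluation leaderboard. The results demonstrate the effectiveness of the proposed framework.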

Celeritas: Fast Optimizer for Large Dataflow Graphs

Hengwei Xu , Yong Liao , Haiyong Xie , Pengyuan Zhou

Category: Artificial Intelligence

2022-07-30

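Rapidly growing neural network models are becoming increasingly challenging to run on a single device. Model parallelism across multiple devices is therefore critical to training large models efficiently. Recent proposals fall short because of either long processing times or poor performance. We therefore propose Celeritas, a fast framework for optimizing device placement for large models. Celeritas employs a simple but efficient model parallelization strategy in the Standard Evaluation and generates placement policies through a series of scheduling algorithms. We conduct experiments to deploy and evaluate Celeritas on many large models. The results show that, compared with most advanced methods, Celeritas not only reduces the placement policy generation time by 26.4% but also improves the model running time by 34.2%.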

Modeling Human Preference and Stochastic Error for Medical Image Segmentation with Multiple Annotators

Liao Zehui , Hu Shishuai , Xie Yutong , Xia Yong

Category: Computer Vision

2021-11-26

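Manual annotation of medical images is highly subjective, leading to inevitable and large annotation biases. Deep learning models may surpass human performance on a variety of tasks, but they may also mimic or amplify these biases. Although we can have multiple annotators and fuse their annotations to reduce stochastic errors, we cannot use this strategy to handle the bias caused by annotators' preferences. In this paper, we highlight the issue of annotator-related biases in medical image segmentation tasks and propose a Preference-involved Annotation Distribution Learning (PADL) framework to address it. The framework uses distribution learning to disentangle an annotator's preference from stochastic errors, so that it produces not only a meta segmentation but also the segmentation each annotator would make. Under this framework, a stochastic error modeling (SEM) module estimates the meta segmentation map and the average stochastic error map, and a series of human preference modeling (HPM) modules estimate each annotator's segmentation and the corresponding stochastic error. We evaluated our PADL framework on two medical image benchmarks with different imaging modalities, both annotated by multiple medical professionals, and achieved promising performance on all five medical image segmentation tasks.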

Domain and Content Adaptive Convolution based Multi-Source Domain Generalization for Medical Image Segmentation

Shishuai Hu , Zehui Liao , Jianpeng Zhang , Yong Xia

Category: Computer Vision

2021-09-13

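The domain gap, caused mainly by variable medical image quality, is a major obstacle on the path from training a segmentation model in the lab to applying the trained model to unseen clinical data. Domain generalization methods have been proposed to address this issue, but they usually use static convolutions and are less flexible. In this paper, we propose a multi-source domain generalization model based on domain and content adaptive convolution (DCAC) for segmenting medical images across different modalities. Specifically, we design a domain adaptive convolution (DAC) module and a content adaptive convolution (CAC) module and incorporate both into an encoder-decoder backbone. In the DAC module, a dynamic convolutional head is conditioned on the predicted domain code of the input, so that the model adapts to the unseen target domain. In the CAC module, a dynamic convolutional head is conditioned on the global image features, so that the model adapts to the test image. We evaluated the DCAC model against the baseline and four state-of-the-art domain generalization methods on prostate segmentation, COVID-19 lesion segmentation, and optic cup/optic disc segmentation tasks. Our results not only show that the proposed DCAC model outperforms all competing methods on each segmentation task but also demonstrate the effectiveness of the DAC and CAC modules. Code is available at https://git.io/dcac.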
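The idea of a convolutional head "conditioned on" a code can be sketched generically: a small controller maps the conditioning vector to the weights and bias of a 1x1 convolution, which is then applied to the feature map. This is only an illustration of dynamic convolution in general, not the DCAC implementation; the linear controller, the one-hot domain codes, the shapes, and all names below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_controller(code_dim, in_ch, out_ch):
    """Hypothetical linear controller: maps a code to the 1x1 conv weights plus bias."""
    n_params = out_ch * in_ch + out_ch
    return rng.normal(scale=0.1, size=(code_dim, n_params))

def dynamic_head(features, code, controller, out_ch):
    """Apply a 1x1 convolution whose parameters are generated from `code`."""
    in_ch = features.shape[0]
    params = code @ controller
    weight = params[: out_ch * in_ch].reshape(out_ch, in_ch)
    bias = params[out_ch * in_ch:]
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    return np.einsum("oc,chw->ohw", weight, features) + bias[:, None, None]

features = rng.normal(size=(8, 4, 4))        # (channels, height, width)
controller = make_controller(code_dim=3, in_ch=8, out_ch=2)
code_a = np.array([1.0, 0.0, 0.0])           # assumed one-hot domain codes
code_b = np.array([0.0, 1.0, 0.0])
out_a = dynamic_head(features, code_a, controller, out_ch=2)
out_b = dynamic_head(features, code_b, controller, out_ch=2)
```

The same feature map yields different outputs for different codes, which is the flexibility a static convolution lacks. Conditioning on global image features instead of a domain code would follow the same pattern with a different input to the controller.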

Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

Yue Han , Jiangning Zhang , Zhucun Xue , Chao Xu , Xintian Shen , Yabiao Wang , Chengjie Wang , Yong Liu , Xiangtai Li

Category: Computer Vision

2023-01-03

Few-Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on the novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared with existing approaches across different shots; for example, we boost nAP by +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few-Shot Object Detection. Code and model will be made available.

EZInterviewer: To Improve Job Interview Performance with Mock Interview Generator

Mingzhe Li , Xiuying Chen , Weiheng Liao , Yang Song , Tao Zhang , Dongyan Zhao , Rui Yan

Category: Natural Language Processing

2023-01-03

The interview is regarded as one of the most crucial steps in recruitment. To prepare fully for interviews with recruiters, job seekers usually practice mock interviews with one another. However, a mock interview with peers is generally far from the real interview experience: the mock interviewers are not guaranteed to be professional and are unlikely to behave like a real interviewer. Owing to the rapid growth of online recruitment in recent years, recruiters tend to hold interviews online, which makes it possible to collect real interview data from real interviewers. In this paper, we propose a novel application named EZInterviewer, which aims to learn from online interview data and provide mock interview services to job seekers. The task is challenging in two ways: (1) interview data are now available but still low-resource; (2) generating meaningful and relevant interview dialogs requires a thorough understanding of both resumes and job descriptions. To address the low-resource challenge, EZInterviewer is trained on a very small set of interview dialogs. The key idea is to reduce the number of parameters that rely on interview dialogs by disentangling the knowledge selector and the dialog generator, so that most parameters can be trained with ungrounded dialogs and resume data, which are not low-resource. Evaluation results on a real-world job interview dialog dataset indicate that we achieve promising results in generating mock interviews. With the help of EZInterviewer, we hope to make mock interview practice easier for job seekers.
